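The instruction set architecture (ISA) is a fundamental abstraction layer and the formal contract between software and hardware. High-level languages such as C hide this complexity, but the ISA exposes the architectural state: the precise configuration of processor registers and memory.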
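1. Architectural State
An x86-64 processor defines its state through several key components:
- Program counter (%rip): holds the address of the next instruction.
- Integer register file: 16 general-purpose registers (for example, %rax and %rbx), each holding a 64-bit value.
- Condition codes: flags (ZF, SF, CF, OF) that record the outcome of the most recent arithmetic or logical operation and are used for control flow.
- Vector registers: for example, the YMM registers (256 bits), used for SIMD operations.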
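As a rough illustration of how the condition codes relate to an arithmetic result, the sketch below models a 64-bit addition in Python and derives ZF, SF, CF, and OF from it. The function name `add_with_flags` is hypothetical, and the model covers addition only; it is not an emulator.

```python
def add_with_flags(a, b, bits=64):
    """Model `a + b` on a `bits`-wide register and report the four flags.

    Hypothetical helper for illustration; only addition is modeled.
    """
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    a &= mask
    b &= mask
    full = a + b                 # unbounded Python integer sum
    result = full & mask         # what actually fits in the register

    flags = {
        "ZF": result == 0,                       # result is zero
        "SF": bool(result & sign_bit),           # result is negative (signed view)
        "CF": full > mask,                       # carry out: unsigned overflow
        # Signed overflow: operands share a sign that the result does not.
        "OF": bool(~(a ^ b) & (a ^ result) & sign_bit),
    }
    return result, flags


# Largest positive signed value plus one overflows in the signed view only.
print(add_with_flags(2**63 - 1, 1)[1])
# All ones plus one wraps to zero: unsigned carry, no signed overflow.
print(add_with_flags(2**64 - 1, 1)[1])
```

The two calls show why CF and OF are separate flags: the same bit pattern can overflow under one interpretation and not the other.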
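2. Memory Abstraction
Machine code views memory as one large byte-addressable array. Although x86-64 defines 64-bit virtual addresses, current implementations typically use a 48-bit address space ($2^{48}$ bytes). Data sizes are classified as word (16 bits), double word (32 bits), and quad word (64 bits).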
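The byte-array view of memory can be sketched in a few lines of Python. This is a toy model, not how hardware is implemented: `store` and `load` are hypothetical helpers, the 32-byte buffer stands in for memory, and the little-endian byte order is the one x86-64 uses.

```python
# Sizes of the data types named in the text, in bytes.
SIZES = {"byte": 1, "word": 2, "double word": 4, "quad word": 8}


def store(mem, addr, value, size):
    """Write a non-negative `value` as `size` bytes starting at `addr` (little-endian)."""
    mem[addr:addr + size] = value.to_bytes(size, "little")


def load(mem, addr, size):
    """Read `size` bytes starting at `addr` as an unsigned integer."""
    return int.from_bytes(mem[addr:addr + size], "little")


mem = bytearray(32)                       # a tiny stand-in for memory
store(mem, 0, 0x0123456789ABCDEF, SIZES["quad word"])

print(hex(mem[0]))                            # 0xef: low-order byte sits at the lowest address
print(hex(load(mem, 0, SIZES["word"])))       # 0xcdef
print(hex(load(mem, 0, SIZES["double word"])))  # 0x89abcdef

# A 48-bit address space covers 2**48 bytes, which is 256 TiB.
print(2**48 // 2**40)                     # 256
```

Reading the same address at different sizes returns progressively more of the stored value, which is how the b/w/l/q instruction variants see memory.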
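3. Evolution and Compatibility
Driven by Moore's Law, Intel has evolved from the 8086 to the Core i7 Haswell. The ISA guarantees backward compatibility, so legacy machine code still runs on modern multicore, hyperthreaded hardware.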
QUESTION 1
Practice Problem 2.52: Convert the Format A bit pattern 0 011 000 (Sign: 1 bit, Exp: 3 bits, Frac: 3 bits) to the closest value in Format B (Sign: 1 bit, Exp: 2 bits, Frac: 4 bits). Use round-to-even.
0 01 0000
0 10 0000
0 11 1000
0 00 1111
✅ Correct!
In Format A (bias 3), the exponent field 011 encodes $E = 3 - 3 = 0$, so the value is $1.0 \times 2^0 = 1.0$. In Format B (bias 1), $E = 0$ is encoded as 01, so 1.0 is 0 01 0000.
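The conversion can be checked mechanically. The sketch below decodes a small IEEE-style pattern exactly and then searches all finite non-negative patterns of the target format for the closest one, breaking ties toward an even fraction. The helper names `decode` and `nearest_pattern` are hypothetical, and the brute-force search is a convenience for tiny formats rather than how hardware rounds.

```python
from fractions import Fraction


def decode(bits, k, n):
    """Exact value of a finite pattern with 1 sign, k exponent, and n fraction bits."""
    sign = (bits >> (k + n)) & 1
    exp = (bits >> n) & ((1 << k) - 1)
    frac = bits & ((1 << n) - 1)
    bias = (1 << (k - 1)) - 1
    if exp == (1 << k) - 1:
        raise ValueError("infinity and NaN have no finite value")
    if exp == 0:   # denormalized: no implied leading 1, E = 1 - bias
        mag = Fraction(frac, 1 << n) * Fraction(2) ** (1 - bias)
    else:          # normalized: implied leading 1, E = exp - bias
        mag = (1 + Fraction(frac, 1 << n)) * Fraction(2) ** (exp - bias)
    return -mag if sign else mag


def nearest_pattern(value, k, n):
    """Closest finite non-negative pattern to `value`, ties to even.

    Simplification: values above the largest finite number clamp to it
    instead of rounding to infinity.
    """
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    finite = range(((1 << k) - 1) << n)   # sign 0, exponent field not all ones
    return min(finite, key=lambda b: (abs(decode(b, k, n) - value), b & 1))


value = decode(0b0_011_000, 3, 3)          # Format A pattern from the question
print(value)                               # 1
print(format(nearest_pattern(value, 2, 4), "07b"))   # 0010000, i.e. 0 01 0000
```

Because 1.0 is exactly representable in Format B, no rounding occurs here; the tie-breaking rule only matters for values that fall midway between two Format B patterns.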
Hint: Calculate the actual value first: $(-1)^s \times 1.f \times 2^{E}$, where $E$ is the exponent field minus the bias. The Format A bias is 3 ($2^{k-1}-1$ with $k = 3$).
QUESTION 2
Which of the following assembly lines is valid for a 64-bit system?
movb %ebx, (%rax)
movq %rax, $0x123
movl %eax, (%rsp)
movw %cl, %ax
✅ Correct!
movl %eax, (%rsp) is valid: it moves a 32-bit double word from %eax into memory at the address held in %rsp. The others fail because of size mismatches (%ebx is 32 bits but movb moves 8 bits; %cl is 8 bits but movw moves 16 bits) or because the destination is an immediate value ($0x123).
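The rules applied above (suffix must match register width, destination cannot be an immediate) are simple enough to encode. The following is a minimal sketch with a hypothetical `is_valid_mov` helper and a deliberately small register table; it ignores many real assembler rules, such as limits on immediate sizes, and is not a substitute for an assembler.

```python
SUFFIX_BYTES = {"b": 1, "w": 2, "l": 4, "q": 8}

# Only a handful of registers, enough for the quiz options.
REG_BYTES = {
    "%al": 1, "%bl": 1, "%cl": 1, "%dl": 1,
    "%ax": 2, "%bx": 2, "%cx": 2, "%dx": 2,
    "%eax": 4, "%ebx": 4, "%ecx": 4, "%edx": 4,
    "%rax": 8, "%rbx": 8, "%rcx": 8, "%rdx": 8, "%rsp": 8,
}


def split_operands(text):
    """Split 'src, dst' on the comma that is not inside parentheses."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return text[:i].strip(), text[i + 1:].strip()
    raise ValueError("expected two operands")


def kind(op):
    """Classify an operand as immediate, register, or memory reference."""
    if op.startswith("$"):
        return "imm"
    if op.startswith("%"):
        return "reg"
    return "mem"


def is_valid_mov(line):
    """Check the size and operand-type rules discussed in the text."""
    mnemonic, rest = line.split(None, 1)
    if len(mnemonic) != 4 or mnemonic[:3] != "mov" or mnemonic[3] not in SUFFIX_BYTES:
        raise ValueError(f"not a simple mov instruction: {mnemonic}")
    width = SUFFIX_BYTES[mnemonic[3]]
    src, dst = split_operands(rest)
    if kind(dst) == "imm":
        return False                      # cannot store into a constant
    if kind(src) == "mem" and kind(dst) == "mem":
        return False                      # mov has no memory-to-memory form
    for op in (src, dst):
        if kind(op) == "reg":
            if op not in REG_BYTES:
                raise ValueError(f"register not in the sketch's table: {op}")
            if REG_BYTES[op] != width:
                return False              # suffix and register width disagree
    return True


for line in ["movb %ebx, (%rax)", "movq %rax, $0x123",
             "movl %eax, (%rsp)", "movw %cl, %ax"]:
    print(line, "->", is_valid_mov(line))
```

Only the third line passes, matching the answer above.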
Hint: Review register sizes in bits: %al = 8, %ax = 16, %eax = 32, %rax = 64. The instruction suffix must match the register size.
QUESTION 3
If a processor requires 25 cycles for predictable branches and 45 cycles for random branches, what is the Branch Misprediction Penalty ($T_{MP}$)?
20 cycles
40 cycles
10 cycles
65 cycles
✅ Correct!
Using $T_{avg} = T_{OK} + p \times T_{MP}$ with $p = 0.5$ (random branches): $45 = 25 + 0.5 \times T_{MP}$. Thus $20 = 0.5 \times T_{MP}$, so $T_{MP} = 40$ cycles.
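The same rearrangement, $T_{MP} = (T_{avg} - T_{OK}) / p$, can be written as a one-line helper. The function name `misprediction_penalty` is hypothetical; the default $p = 0.5$ reflects the assumption that a random branch is mispredicted half the time.

```python
def misprediction_penalty(t_ok, t_avg, p=0.5):
    """Solve T_avg = T_OK + p * T_MP for T_MP.

    t_ok:  cycles when the branch is predicted correctly
    t_avg: measured average cycles at misprediction rate p
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return (t_avg - t_ok) / p


print(misprediction_penalty(25, 45))   # 40.0
```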
Hint: The average time for random branching includes a 50% chance of misprediction.
QUESTION 4
In the 'Guarded-Do' transformation of a while loop, what happens first?
The loop body executes once unconditionally.
An initial conditional branch checks if the loop should be skipped entirely.
The loop is converted into a switch statement.
The compiler uses a jump-to-middle strategy.
✅ Correct!
Guarded-Do uses an initial 'if' (the guard) to check the condition before entering the 'do-while' style loop body.
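The shape of the transformation can be imitated in Python, even though Python has no do-while loop or goto. In the sketch below, a `while True` loop with a test at the bottom stands in for do-while; the factorial functions are hypothetical examples chosen only because they are short.

```python
def fact_while(n):
    """Ordinary while loop: the test runs before every iteration."""
    result = 1
    while n > 1:
        result *= n
        n -= 1
    return result


def fact_guarded_do(n):
    """The same loop in guarded-do form."""
    result = 1
    if n > 1:                  # guard: skip the loop entirely if the test fails
        while True:            # do-while body
            result *= n
            n -= 1
            if not (n > 1):    # test at the bottom of the loop
                break
    return result


print([fact_guarded_do(n) for n in range(6)])   # [1, 1, 2, 6, 24, 120]
```

Both functions return the same values for every input, including those where the loop body never runs; the guard is what preserves that case.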
Hint: Guarded-Do differs from Jump-to-Middle by placing the conditional test at the start of the entire construct.
QUESTION 5
What is the primary function of the leaq instruction when it is not used to compute a memory address?
To perform fast arithmetic like x + k*y.
To clear the condition code registers.
To move data between XMM registers.
To sign-extend a byte to a quad word.
✅ Correct!
leaq (load effective address) uses the address-computation hardware to perform additions and shifts (scaling by 1, 2, 4, or 8) without referencing memory.
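The value leaq produces for an operand of the form disp(base, index, scale) is simply disp + base + index * scale, truncated to 64 bits. A minimal Python model follows; the function `leaq` here is a hypothetical stand-in that takes register values as plain integers.

```python
MASK64 = (1 << 64) - 1


def leaq(disp=0, base=0, index=0, scale=1):
    """Value computed for the operand disp(base, index, scale); no memory is read."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (disp + base + index * scale) & MASK64


x, y = 5, 3
print(leaq(base=x, index=y, scale=4))           # x + 4*y = 17
print(leaq(disp=7, base=x, index=y, scale=4))   # 7 + x + 4*y = 24
```

The restriction on `scale` is why compilers use leaq for multipliers such as 2, 3, 5, or 9 (for example x + 2*x) but need other instructions for arbitrary constants.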
Hint: Although leaq uses memory-operand syntax, it does not dereference the address.
Case Study: Complex Argument Handling
Analysis of Procedure Calls in ISO C99
You are reverse engineering a C program that uses complex numbers (ISO C99). A function complex_add receives two 128-bit structures (representing real and imaginary parts) and returns a result. In the x86-64 code, you notice that %rdi is used for an address, even though the C prototype doesn't show an explicit pointer argument.
Q
How are large complex structures typically passed as arguments to functions?
Solution:
The first six integer or pointer arguments fit in registers such as %rdi and %rsi. Large structures are instead passed in memory: the caller either copies them onto the stack or passes a pointer to a copy in a register. Which structures count as large depends on the ABI. Under the System V AMD64 ABI, aggregates larger than 16 bytes are copied onto the stack, while smaller ones (including a 128-bit pair of doubles) can travel in registers. The Microsoft x64 convention passes anything larger than 8 bytes by pointer.
Q
How are complex values or large structs returned from a function in x86-64?
Solution:
A return value that is too large for the return registers cannot come back in %rax. Instead, the caller allocates space in its own stack frame and passes the address of that space in %rdi as a hidden first argument. The callee writes the return value directly to that memory location and returns the address in %rax. This is why %rdi can hold an address even though the C prototype shows no pointer. Under the System V AMD64 ABI this applies to aggregates larger than 16 bytes; smaller results come back in %rax (with %rdx) or %xmm0 (with %xmm1).
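The hidden-pointer convention can be mimicked in Python to make the data flow concrete. This is only an analogy: a two-element list stands in for the caller's stack slot, the function `complex_add_lowered` is a hypothetical picture of what the compiled function looks like after lowering, and the (real, imaginary) layout is assumed.

```python
def complex_add_lowered(result_slot, a, b):
    """Callee: writes the sum through the hidden pointer, then returns that pointer."""
    result_slot[0] = a[0] + b[0]    # real part
    result_slot[1] = a[1] + b[1]    # imaginary part
    return result_slot              # plays the role of the address returned in %rax


def caller():
    slot = [0.0, 0.0]               # space reserved in the caller's own frame
    returned = complex_add_lowered(slot, (1.0, 2.0), (3.0, -1.0))
    assert returned is slot         # the callee hands back the same address
    return slot


print(caller())                     # [4.0, 1.0]
```

The extra first parameter is what shows up as an address in %rdi in the disassembly, shifting the visible C arguments to the following argument positions.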